This report analyzes and interprets crime and temperature data collected in the city of Colchester in 2024. The analysis aims to identify patterns and correlations between crime incidents and climatic conditions in the area. Two datasets are used:
crime24.csv: This file contains street-level crime data from Colchester for the year 2024. It was extracted using the interface described at UK Police Crime Data. The dataset includes information such as the category of crime, location details, date, and outcome status.
temp24.csv: This file contains daily climate data recorded at a weather station near Colchester in 2024. The data were retrieved using the interface outlined at Ogimet Climate Data. It includes variables such as temperature, humidity, and other meteorological factors.
This report will cover the data cleaning process, exploratory data analysis, data visualization, and interpretation of the findings.
The crime dataset consists of 6,304 rows and 13 variables, while the temperature dataset contains 366 rows and 18 variables. In both datasets, the date variable is stored as a character string (chr). It must therefore be converted to a date format before any time-based handling or analysis.
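The conversion can be sketched in base R as follows. This is only an illustration on toy vectors: the report does not state the date formats, so the "YYYY-MM" and "YYYY-MM-DD" layouts and the names `crime_dates` and `temp_dates` are assumptions, not the actual columns.

```r
# Toy stand-ins for the two date columns; both formats are assumptions.
crime_dates <- c("2024-01", "2024-02", "2024-12")          # assumed "YYYY-MM"
temp_dates  <- c("2024-01-01", "2024-02-29", "2024-12-31") # assumed "YYYY-MM-DD"

# A year-month string has no day, so append "-01" before parsing.
crime_parsed <- as.Date(paste0(crime_dates, "-01"), format = "%Y-%m-%d")
temp_parsed  <- as.Date(temp_dates, format = "%Y-%m-%d")

# Strings that do not match the format become NA, so check before continuing.
stopifnot(!anyNA(crime_parsed), !anyNA(temp_parsed))
class(crime_parsed)
```

If the real columns use a different layout, only the `format` string needs to change.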
The variables “Context” and “Location Subtype” in the crime data exhibit extremely high missingness, so they carry little information and can be excluded from further analysis. The variables “Persistent ID” and “Outcome Status” have moderate missingness and require appropriate handling, such as imputation. The remaining variables, including “Category”, “Date”, “Lat”, and “Long”, have no missing values and can be used directly in the analysis.
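# Summarise missing values per variable (columns 2 to 13) and render as a table
library(naniar)    # miss_var_summary()
library(gt)        # gt(), tab_header(), %>%
library(gtExtras)  # gt_theme_guardian()
miss_var_summary(Crime_data[, 2:13]) %>%
  gt() %>%
  gt_theme_guardian() %>%
  tab_header(title = "Missingness of the variables")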
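One way to act on a missingness summary like this is sketched below in base R. It uses a toy data frame with hypothetical column names, an assumed 90% cut-off for dropping columns, and an explicit "Unknown" label for missing outcomes. These choices are illustrative and are not necessarily the handling applied in this report.

```r
# Toy data frame with hypothetical columns standing in for the crime data.
toy <- data.frame(
  category = c("burglary", "drugs", "robbery", "burglary"),
  context  = c(NA, NA, NA, NA),
  outcome  = c("Under investigation", NA, "No suspect", "No suspect"),
  stringsAsFactors = FALSE
)

# Share of missing values per column.
miss_share <- colMeans(is.na(toy))

# Drop columns whose missing share exceeds an assumed 90% threshold.
threshold <- 0.9
toy_clean <- toy[, miss_share <= threshold, drop = FALSE]

# Label missing outcomes explicitly instead of guessing a value.
toy_clean$outcome[is.na(toy_clean$outcome)] <- "Unknown"
names(toy_clean)
```

Using an explicit label keeps every row available for analysis while making it visible which outcomes were not recorded.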